 Samsun Province



NarraBench: A Comprehensive Framework for Narrative Benchmarking

Hamilton, Sil, Wilkens, Matthew, Piper, Andrew

arXiv.org Artificial Intelligence

We present NarraBench, a theory-informed taxonomy of narrative-understanding tasks, as well as an associated survey of 78 existing benchmarks in the area. We find significant need for new evaluations covering aspects of narrative understanding that are either overlooked in current work or are poorly aligned with existing metrics. Specifically, we estimate that only 27% of narrative tasks are well captured by existing benchmarks, and we note that some areas -- including narrative events, style, perspective, and revelation -- are nearly absent from current evaluations. We also note the need for increased development of benchmarks capable of assessing constitutively subjective and perspectival aspects of narrative, that is, aspects for which there is generally no single correct answer. Our taxonomy, survey, and methodology are of value to NLP researchers seeking to test LLM narrative understanding.


Chang, Trenton, Warrenburg, Lindsay

Neural Information Processing Systems

We consider a multi-agent setting where the goal is to identify the "worst offenders": agents that are gaming most aggressively. However, identifying such agents is difficult without being able to evaluate their utility function. Thus, we introduce a framework featuring a gaming deterrence parameter, a scalar that quantifies an agent's (un)willingness to game. We show that this gaming parameter is only partially identifiable.
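
As a rough illustration of what a scalar gaming deterrence parameter could look like, the sketch below assumes a toy model that is not taken from the paper: an agent earns a reward proportional to the value it reports and pays a quadratic cost, scaled by its deterrence, for deviating from its true value. The function names, the linear reward, and the quadratic cost are all hypothetical choices made for this example.

```python
def best_response(true_value: float, deterrence: float, reward_rate: float = 1.0) -> float:
    """Report that maximizes a hypothetical utility.

    Assumed utility (illustrative only, not the paper's):
        reward_rate * report - deterrence * (report - true_value) ** 2
    Setting the derivative to zero gives
        report = true_value + reward_rate / (2 * deterrence).
    """
    if deterrence <= 0:
        raise ValueError("deterrence must be positive")
    return true_value + reward_rate / (2.0 * deterrence)


def gaming_amount(true_value: float, deterrence: float, reward_rate: float = 1.0) -> float:
    """How far the agent's report sits above its true value."""
    return best_response(true_value, deterrence, reward_rate) - true_value


# Two hypothetical agents with different true values and deterrence levels.
report_a = best_response(true_value=0.5, deterrence=1.0)
report_b = best_response(true_value=0.75, deterrence=2.0)

# Both agents submit the same report, yet agent A games twice as much.
same_report = report_a == report_b
a_games_more = gaming_amount(0.5, 1.0) > gaming_amount(0.75, 2.0)
```

In this toy setup, an observer who sees only the reports cannot tell the two agents apart, which gives one informal intuition for why a deterrence parameter might not be fully recoverable from observed behavior alone. The paper's own identifiability argument may rest on different assumptions.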






38 Best Early Amazon Prime Day Deals On Products We've Tested (2025)

WIRED

Amazon Prime Day 2025 is fast approaching, and the sale is already underway on some items. To help you find the best early Prime Day deals, we've scoured Amazon for deals on the tech we love. As always, every deal we recommend here is on a product our reviewers have personally tested and approved--you won't find any shoddy dupes or mystery brands here. This year Prime Day runs for four days, July 8-11, rather than the usual two. That means there's twice as long to save. Be sure to read our explainer on all the Amazon Prime perks you should be taking advantage of.


Incentive-Aware Machine Learning; Robustness, Fairness, Improvement & Causality

Podimata, Chara

arXiv.org Artificial Intelligence

Machine Learning (ML) algorithms are deeply embedded in various aspects of modern life, influencing everything from enhancing daily conveniences and shaping online purchasing behavior to making critical decisions in areas such as hiring, loan approvals, college admissions, and probation rulings. Given the high stakes of these decisions, individuals often have strong incentives to strategically modify the data they provide to these algorithms to secure more favorable outcomes. For instance, individuals might open additional credit accounts or take other steps to improve their credit scores before applying for a loan. In the context of college admissions, applicants may retake standardized tests like the GRE, enroll in test preparation courses, or even switch schools to boost their class rankings, all in efforts to present themselves as more competitive candidates. Such instances of "strategic adaptation" have been extensively documented across disciplines including Economics, CS, and Public Policy Björkegren et al. [2020], Dee et al. [2019], Dranove et al. [2003], Greenstone et al. [2022], Gonzalez-Lira and Mobarak [2019], Chang et al. [2024]. The challenge arises when decision-makers deploying ML algorithms fail to account for these adaptations, potentially undermining the original goals of the policies the algorithms are intended to support. For example, in college admissions, a student's decision to change schools solely to improve their class ranking may not necessarily reflect a substantive improvement in their qualifications. This literature review was recently published in SIGEcom Exchanges.


Machine Learning Should Maximize Welfare, Not (Only) Accuracy

Rosenfeld, Nir, Xu, Haifeng

arXiv.org Artificial Intelligence

Decades of research in machine learning have given us powerful tools for making accurate predictions. But when used in social settings and on human inputs, better accuracy does not immediately translate to better social outcomes. This may not be surprising given that conventional learning frameworks are not designed to express societal preferences -- let alone promote them. This position paper argues that machine learning is currently missing, and can gain much from incorporating, a proper notion of social welfare. The field of welfare economics asks: how should we allocate limited resources to self-interested agents in a way that maximizes social benefit? We argue that this perspective applies to many modern applications of machine learning in social contexts, and advocate for its adoption. Rather than disposing of prediction, we aim to leverage this forte of machine learning for promoting social welfare. We demonstrate this idea by proposing a conceptual framework that gradually transitions from accuracy maximization (with awareness to welfare) to welfare maximization (via accurate prediction). We detail applications and use-cases for which our framework can be effective, identify technical challenges and practical opportunities, and highlight future avenues worth pursuing.
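
To make the gap between accuracy and welfare concrete, here is a minimal sketch under stated assumptions. It is not the paper's framework: the `Person` record, the `predicted_need` and `benefit` fields, and all the numbers are hypothetical. It compares allocating a limited resource by a model's prediction with allocating it by the gain each recipient would actually receive.

```python
from typing import Callable, List, NamedTuple


class Person(NamedTuple):
    name: str
    predicted_need: float  # hypothetical model output, e.g. predicted risk
    benefit: int           # hypothetical welfare gain if given the resource


def allocate(people: List[Person], budget: int,
             score: Callable[[Person], float]) -> List[Person]:
    """Give the resource to the `budget` people with the highest score."""
    return sorted(people, key=score, reverse=True)[:budget]


def total_welfare(chosen: List[Person]) -> int:
    """Sum of the welfare gains of everyone who received the resource."""
    return sum(p.benefit for p in chosen)


people = [
    Person("A", predicted_need=0.9, benefit=1),
    Person("B", predicted_need=0.8, benefit=7),
    Person("C", predicted_need=0.4, benefit=6),
    Person("D", predicted_need=0.2, benefit=2),
]

# Policy 1: rank by the prediction alone.
by_prediction = allocate(people, budget=2, score=lambda p: p.predicted_need)
# Policy 2: rank by the welfare gain the allocation would produce.
by_benefit = allocate(people, budget=2, score=lambda p: p.benefit)

welfare_prediction = total_welfare(by_prediction)  # A and B
welfare_benefit = total_welfare(by_benefit)        # B and C
```

With these made-up numbers, ranking by prediction selects A and B for a total welfare of 8, while ranking by benefit selects B and C for a total of 13. In practice the benefit is not observed directly and would itself have to be estimated, which is where accurate prediction could still play a role.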